SUMMARY PAGE


THE  PROBLEM 

To determine the influence of word predictability in sentence
intelligibility tests which are used to evaluate speech discrimination
ability in Navy and civilian personnel.


FINDINGS 

It was concluded that word predictability is a factor which influences
sentence intelligibility and that careful selection of key words on the
basis of their predictability status could affect the overall
intelligibility of sentences.


APPLICATION 

The data may be incorporated into speech discrimination tests used to
evaluate the hearing capabilities of Navy personnel. The results also
apply to improvement of speech intelligibility among Navy divers during
deep submergence operations and in other Navy environments where a
degradation in speech intelligibility occurs.
ABSTRACT


In line with the Naval Submarine Medical Research Laboratory's
continuing effort to improve communication in the Navy, this study was
instituted to investigate the relationship between word predictability
and sentence intelligibility. This relationship was examined by
comparing the accuracy of responses by listeners to several lists of
sentences. Three methods of scoring used different groups of key words
which had previously been judged to represent different degrees of
predictability. It was hypothesized that the scores obtained would be a
function of the predictability status of the key words used in scoring.
Results indicated significant differences between the three scoring
procedures for each sentence list under two filtering conditions, and
these differences were in the hypothesized direction. The results
suggest that the use of easy-to-predict words will increase sentence
intelligibility while the use of difficult-to-predict words will
depress intelligibility. It was concluded that word predictability is a
factor influencing sentence intelligibility and that careful selection
of key words on the basis of predictability may be a way of controlling
the intelligibility of sentences.


STUDIES  IN  NAVY  COMMUNICATION: 

THE  EFFECT  OF  WORD  PREDICTABILITY  ON  SENTENCE  INTELLIGIBILITY 


INTRODUCTION 

Monosyllabic word lists1,2 have enjoyed widespread use in the
assessment of speech discrimination ability, primarily due to the ease
of including phonetic elements in a proportion comparable to their
relative occurrence in normal conversational speech. The most commonly
used words can be incorporated, minimizing the effects of vocabulary
and listeners' intelligence. Furthermore, such tests are easy to
administer and score. However, monosyllabic word lists do not
adequately sample factors such as context, stress, accent, intonation,
voice quality, and duration, which normally provide cues to
intelligibility in conversational speech. Some2,3,4,5 have suggested
that words embedded in sentences may be a more realistic measure of
reception of conversational speech.

Although the use of sentences may overcome some of the disadvantages
encountered with monosyllabic word lists, several characteristics of
the average sentence should be investigated prior to recommending their
general use. One such characteristic is word predictability, that is,
the property of a sentence which permits prediction of a missing word
or words in that sentence. Because of varying contextual clues, the
predictability status differs for words within a given sentence6, as
well as between sentences for the same word7.


Since word predictability may play an important role in the
intelligibility of connected speech, that role must logically be
quantified before an effective test instrument can be constructed.

This  study  fills  that  need. 


PROCEDURE 

A. Selection of Sentence Lists. Predictability values for several
sentence lists were determined by Giolas, et al6. In that study, each
list of sentences was recorded in its entirety, and varying percentages
of the total number of key words in each list were then eliminated by
splicing out randomly selected key words and replacing them with
identical amounts of leader tape. Each group of subjects listened to
the sentence lists under one of several word elimination conditions and
was asked to write down what they felt were the missing words. Analysis
of these results indicated that the C.I.D. Sentence Lists B and D8 and
the Revised C.I.D. Sentence List C9 provided a wide range of
predictability values for the key words within each list. Consequently,
these lists were selected for use in the present study.

B. Preparation of Stimulus Material. Sections of the master tape of the
prior study were used6. In order to increase error responses and avoid
the "ceiling effect", we re-recorded the three sentence lists,
incorporating low-pass filtering at 420 and 360 Hz using an Allison
Model 2B filter (36 dB/oct). A pilot group of five listeners determined
that these cut-off points yielded scores ranging between 30% and 70%.
The VU meters of the playback and record units were matched during our
re-recording, and a 1000-Hz calibration tone was inserted on each tape.

The practice sentences, five for each filtering, were inserted on the
tape before the actual test sentences to minimize response errors on
the initial test sentences due to the subjects' unfamiliarity with the
novel listening task.

C. Subjects. Sixty Submarine School Candidates were used, in two groups
of 30 each. Each group was first given the taped, pure-tone audiometric
screening test at 0.5-8 kHz. Those who failed (Hearing Level 25+ dB re
ISO) at two or more frequencies in either ear, or at the same frequency
in both ears, were eliminated.
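
The elimination rule can be written as a short check. The following is a minimal sketch only: the function name, the dictionary layout, and the example frequencies are assumptions made for illustration, and the report does not list the individual test frequencies within the 0.5-8 kHz range.

```python
# Hypothetical sketch of the screening rule; names and data layout are assumed.
FAIL_LEVEL_DB = 25  # "Hearing Level 25+ dB re ISO" counts as a failed frequency


def fails_screening(left, right, fail_level_db=FAIL_LEVEL_DB):
    """Return True if a candidate would be eliminated.

    left, right: dicts mapping test frequency (Hz) to hearing level (dB)
    for the left and right ear respectively.
    """
    left_fails = {f for f, level in left.items() if level >= fail_level_db}
    right_fails = {f for f, level in right.items() if level >= fail_level_db}

    # Criterion 1: two or more failed frequencies in either ear.
    two_or_more_in_one_ear = len(left_fails) >= 2 or len(right_fails) >= 2
    # Criterion 2: the same frequency failed in both ears.
    same_frequency_both_ears = bool(left_fails & right_fails)

    return two_or_more_in_one_ear or same_frequency_both_ears
```

Under this reading, a single failed frequency in one ear alone does not eliminate a candidate.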

D. Presentation. The test tapes were played on an Ampex PR-10 tape
recorder through an Altec 1569A amplifier to 49 matched TDH-39
earphones mounted in Otocups, in a room considered to be a good
listening environment. Playback level was established by having two
normal-hearing people listen to the tapes under test conditions and
judge a comfortable level of loudness.

One group responded to monaural presentation of all three lists with
the 420 Hz filter, the other with the 360 Hz filter. Each group was
given the following instructions:


"This is a study to see how well you can understand three groups of
sentences which are distorted in a certain way. Each sentence will be
preceded by its number. Then you will hear the sentence. It will be
said only once, so listen carefully. Then I will stop the tape and you
are to write down, word for word, the sentence you heard. If you are
not sure, take a guess. Try to respond in complete sentences, and do
your best not to leave any sentence blank. Write as neatly as possible
and refrain from comparing your answers to those of others as this will
affect test results. Before the test begins you will listen to five
practice sentences which will give you an idea of the kind of
distortion you will be listening to. Do not write these sentences down.
After listening to them we will begin the test."

After  listening  to  the  five  practice 
sentences,  subjects  were  allowed  to  ask 
any  questions.  Experimental  lists  were 
then  played  in  the  order  B,  D,  C,  with 
a  2-3  min.  rest  period  after  each  list. 

E. Scoring. A subject's score with such lists is usually the number of
all 50 key words in each list correctly identified. In this study, we
also scored the number of correctly identified words from among the 20
most and the 20 least predictable key words.

Homophonous words, as well as identifiable misspelled words, were
accepted as correct. Contractions of words, as well as both words being
spelled out, were also accepted as correct.
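
The three scoring procedures can be illustrated with a short sketch. It is not the scoring method as actually carried out (which was done by hand and accepted homophones, misspellings, and contractions); it matches words exactly, and the sentences, key words, and function names below are invented for illustration and are not C.I.D. items.

```python
import re


def tokenize(text):
    """Lower-case a written response and split it into words."""
    return re.findall(r"[a-z']+", text.lower())


def percent_correct(responses, key_words):
    """Percent of scored key words found in the listener's responses.

    responses: list of written sentences, one per test sentence.
    key_words: list of (sentence_index, word) pairs to be scored.
    """
    hits = 0
    for index, word in key_words:
        if word.lower() in set(tokenize(responses[index])):
            hits += 1
    return 100.0 * hits / len(key_words)


def score_three_ways(responses, full_keys, easy_keys, hard_keys):
    """Score one answer sheet with all three key-word sets."""
    return {
        "full": percent_correct(responses, full_keys),
        "easy": percent_correct(responses, easy_keys),
        "hard": percent_correct(responses, hard_keys),
    }


# Invented two-sentence example (hypothetical data).
responses = ["walk the dog now", "open a window"]
full_keys = [(0, "walk"), (0, "dog"), (1, "open"), (1, "door")]
easy_keys = [(0, "dog"), (1, "open")]    # assumed easiest to predict
hard_keys = [(0, "walk"), (1, "door")]   # assumed most difficult to predict

scores = score_three_ways(responses, full_keys, easy_keys, hard_keys)
```

Expressing each score as a percentage puts the 50-word and 20-word procedures on a common scale.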


RESULTS  AND  DISCUSSION 

All mean data are in Table I. In order to analyze the relationship
between scoring procedures, lists, and filterings, a three-way analysis
of variance was performed10. Since the scoring procedures used two
different numbers of key words, all scores were first converted to
percent correct. The results are in Table II.

[Table I notes: one set of scores is based on a list of the
easiest-to-predict words; another is based on a list of the most
difficult-to-predict words.]

The F obtained for the interaction of filtering, list, and scoring was
not statistically significant. It was, therefore, appropriate to look
at each factor independently, as well as its interaction with either of
the other two factors.


Bartlett's test for homogeneity of variance was performed. Its
nonsignificant result offered no evidence of heterogeneity of variance
across conditions, so the results were analyzed further without
correction.
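
For readers unfamiliar with Bartlett's test, the statistic can be sketched as follows. This is the standard textbook form of the statistic, not a reproduction of the computation performed in this study; the function name is an assumption, and the cell scores themselves are not available here.

```python
import math
from statistics import variance


def bartlett_statistic(groups):
    """Bartlett's statistic for homogeneity of variance across k groups.

    groups: list of lists of scores, one list per experimental cell.
    Under the hypothesis of equal variances the statistic is approximately
    chi-square distributed with k - 1 degrees of freedom.
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    total = sum(sizes)
    variances = [variance(g) for g in groups]  # unbiased sample variances

    # Pooled variance, weighted by each group's degrees of freedom.
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / (total - k)

    numerator = (total - k) * math.log(pooled) - sum(
        (n - 1) * math.log(v) for n, v in zip(sizes, variances)
    )
    correction = 1 + (sum(1 / (n - 1) for n in sizes) - 1 / (total - k)) / (
        3 * (k - 1)
    )
    return numerator / correction
```

The statistic is zero when all sample variances are equal and grows as they diverge; a nonsignificant value is what permits the uncorrected analysis of variance.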

A. Differences Between Filter Conditions. As expected, and as can be
seen in Table II, a significant F was obtained for the two filtering
conditions, indicating that poorer scores were obtained with the more
limiting filter condition.


TABLE II

RESULTS OF THREE-WAY ANALYSIS OF VARIANCE OF THREE SCORING
PROCEDURES USED TO EVALUATE RESPONSES FROM TWO GROUPS
OF SUBJECTS TO C.I.D. SENTENCE LISTS B AND D AND REVISED
C.I.D. LIST C UNDER TWO LOW PASS FILTERING CONDITIONS
(420Hz AND 360Hz)

Source of Variation      Sum of Squares    df    Mean Square        F     p*

A (Filtering)                   87401.7     1        87401.7      428    .01
B (Scoring Technique)           17224.7     2        8612.35    42.26    .01
C (Sentences)                   24387.4     2        12193.7    59.84    .01

Interactions
  AB                               25.2     2           12.6      .06
  AC                             4084.8     2         2042.4    10.02    .01
  BC                              978.4     4          244.0     1.20
  ABC                             317.0     4           79.4      .39

within cell
(experimental error)           106309.2   522         203.77

Total                          240789.0   539

p* = point in the F distribution
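
Each F in Table II is the mean square for that source divided by the within-cell (error) mean square. The short sketch below recomputes the ratios from the tabled mean squares; the variable and function names are assumptions for illustration, and the values are copied from Table II as printed.

```python
# Mean squares as printed in Table II.
MS_ERROR = 203.77  # within cell (experimental error), 522 df

MEAN_SQUARES = {
    "A (Filtering)": 87401.7,
    "B (Scoring Technique)": 8612.35,
    "C (Sentences)": 12193.7,
    "AB": 12.6,
    "AC": 2042.4,
    "BC": 244.0,
    "ABC": 79.4,
}


def f_ratios(mean_squares, ms_error):
    """Return F = MS(source) / MS(error) for every source of variation."""
    return {source: ms / ms_error for source, ms in mean_squares.items()}


if __name__ == "__main__":
    for source, f in f_ratios(MEAN_SQUARES, MS_ERROR).items():
        print(f"{source:24s} F = {f:8.2f}")
```

The recomputed ratios agree with the tabled F values to within rounding.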

B. Differences Between Sentence Lists. A significant F (.01 level of
confidence) was obtained for sentence lists (Lists B, D, and C; Table
II), indicating that there truly are differences between these sentence
lists. The significant interaction between filter settings and sentence
lists (.01 level of confidence) further indicates that the two
filterings had differential effects on the lists.


To probe the nature of the differences between the mean
intelligibilities for each list, the Newman-Keuls method for computing
critical differences was employed10 (see Table III). The means
collapsed over all scoring procedures for each list, between and within
conditions, were significantly different at the .01 level of
confidence, except for Lists B and D in the 420 Hz low-pass condition;
thus the responses to the different lists were not the same for the two
filterings, and the question of list equivalency for Lists B and D
arises. Undoubtedly, the equivalency of both the original and the
revised C.I.D. lists should be investigated.


TABLE III

EVALUATION OF THE CRITICAL DIFFERENCES AS OUTLINED BY
WINER (1962) OF THE DIFFERENCES BETWEEN SENTENCE LISTS
BETWEEN AND WITHIN THE TWO FILTERING CONDITIONS
USING THE COMBINED MEANS OF THE THREE
SCORING PROCEDURES FOR COMPARISONS

                         420Hz Low Pass                 360Hz Low Pass
                     List B   List D   List C      List B    List D    List C

420Hz     List B              2.25%    7.54%*      23.82%*   35.09%*   12.13%*
low pass  List D                       9.81%*      11.57%*   32.84%*    9.88%*
          List C                                   31.36%*   42.63%*   19.67%*

360Hz     List B                                             11.27%*   11.69%*
low pass  List D                                                       22.96%*

* = significant at the .01 level of confidence
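
Because every entry in Table III is a difference between two list means, the entries in any one row imply the differences between the column lists. The sketch below illustrates this with the 420 Hz List B row and the 360 Hz entries; it assumes that the tabled values are absolute differences between means, and the names used are hypothetical.

```python
from itertools import combinations

# Differences (percentage points) between the 420 Hz List B mean and each
# 360 Hz list mean, as printed in Table III.
B420_VS_360 = {"B": 23.82, "D": 35.09, "C": 12.13}

# Differences among the 360 Hz lists, as printed in Table III.
REPORTED_WITHIN_360 = {("B", "D"): 11.27, ("B", "C"): 11.69, ("D", "C"): 22.96}


def implied_differences(row):
    """Differences between column means implied by one row of the table.

    Each row entry is |fixed mean - column mean|, so when the fixed mean lies
    above every column mean, the gap between two entries equals the gap
    between the two column means.
    """
    return {(a, b): abs(row[a] - row[b]) for a, b in combinations(row, 2)}


implied = implied_differences(B420_VS_360)
```

The implied values match the printed 360 Hz entries, which is a quick way to check a reconstructed table for internal consistency.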

C. Differences Between Scoring Procedures. Note in Table II the
significant F (.01 level of confidence) obtained for scoring
procedures. The relationship between the mean scores for each scoring
procedure (listed in Table I), under all test conditions, is
illustrated in Figure 1. The results of a Newman-Keuls test for
critical differences indicate significant differences between scoring
procedures for all lists and filterings, and these differences are in
the hypothesized direction. That is, the easy-to-predict words
consistently yielded the highest scores, the difficult-to-predict words
the lowest scores, and the full lists of 50 words yielded scores which
consistently fell in between.

The non-significant interactions of scoring/filtering and
scoring/sentences (see Table II) further indicate that differences
between scoring procedures were similar for all lists under both
filterings.

Differences between the easy-to-predict and the difficult-to-predict
scores were sometimes small, the smallest being 6.7% (List C, 360 Hz
low-pass filtering), but were most often appreciable (15% or greater in
4 of 6 instances; see Figure 1). Differences between the full-list
scoring procedure and the other two procedures were usually relatively
small, ranging from 2.9% to 9.1%.


Fig. 1. Bar graphs illustrating the relationships between the means obtained for each scoring procedure
within each sentence list over the two filtering conditions, as well as the results of the critical
differences evaluation for each of the scoring procedures within each sentence list and
filtering condition.
It should be noted that when key words were eliminated in the study of
Giolas, et al6, they were completely eliminated, while in the present
study all key words were present in the message. The influence of
additional acoustic cues in the present investigation is unknown, but
it is certainly possible that the acoustic cues have altered the
predictability status of the key words, in the sense that some key
words were acoustically more intelligible than others. However, any
systematic bias toward one particular scoring procedure seems quite
unlikely.

When scores for these lists are compared to isolated word
intelligibility11, it is quite apparent that under several low-pass
filterings, sentences are considerably easier to understand than words.
Of course, context is well known to influence speech intelligibility.


CONCLUSIONS 

This study demonstrates a close relationship between the predictability
of words and the intelligibility of sentences incorporating those
words. The data show that the intelligibility of sentences can be
partially controlled by selecting key words for predictability. The
results suggest that if isolation of parameters affecting
intelligibility is desirable, any further development of sentence lists
for use as tests of speech intelligibility should carefully consider
the relative predictability of the key words.

These data reveal that predictability is an important factor
influencing sentence intelligibility tests such as those used
currently, and probably in the future, to evaluate speech reception.
These results can be incorporated into further refinements of sentence
tests designed to evaluate speech reception in Navy personnel. The
results can also be applied to improving speech communications among
Navy personnel working in environments where degradation in speech
intelligibility exists.